This project predicts the likelihood of a student being admitted to a university based on key academic and personal attributes. The model assists applicants in assessing their admission chances and provides universities with a data-driven approach to evaluating candidates.
Key Insights
GRE, TOEFL, CGPA, and research experience strongly influence admission probability.
University rating, SOP, and LOR have a moderate impact on admission chances.
Logistic Regression was chosen for its efficiency, interpretability, and suitability for binary classification.
Feature scaling with StandardScaler put all features on a comparable scale and improved model performance.
The model achieved 94% accuracy and an F1-score of 91%, indicating a good balance between precision and recall.
The ROC AUC score of 0.94 shows that the model separates admitted from non-admitted applicants well.
Deployed an interactive Streamlit web app for real-time user predictions.
Technical Implementation
Data Preprocessing:
Checked for missing values and handled inconsistencies.
Scaled numerical features using StandardScaler for better model performance.
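The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the small array is synthetic, the three columns (GRE, TOEFL, CGPA) are only a subset chosen for brevity, and dropping incomplete rows is an assumed way of handling missing values.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the applicant table (columns: GRE, TOEFL, CGPA).
# These values are made up for illustration; they are not the project's data.
X = np.array([
    [320.0, 110.0, 8.5],
    [300.0, np.nan, 7.2],   # one missing TOEFL score
    [335.0, 118.0, 9.4],
    [310.0, 102.0, 8.0],
])

# Count missing values per column before deciding how to handle them.
missing_per_column = np.isnan(X).sum(axis=0)

# Assumption: rows containing any missing value are simply dropped.
X_clean = X[~np.isnan(X).any(axis=1)]

# StandardScaler centers each column to mean 0 and rescales it to unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_clean)
```

In a train/test workflow the scaler would normally be fitted on the training split only and then reused, via `scaler.transform`, on the test split and on new inputs at inference time.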
Model Selection:
Implemented Logistic Regression, a reliable algorithm for binary classification.
Used GridSearchCV for hyperparameter tuning to optimize performance.
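A hedged sketch of what tuning Logistic Regression with GridSearchCV might look like. The data here is synthetic (`make_classification`), and the parameter grid, the 5-fold setting, and the F1 scoring choice are all assumptions for illustration; the project's actual grid is not documented above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary-classification data with 7 features, standing in for
# the applicant attributes. Not the project's dataset.
X, y = make_classification(n_samples=400, n_features=7, n_informative=4,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Scaling lives inside the pipeline, so each cross-validation fold fits
# its own scaler on that fold's training part only.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical grid: only the inverse regularization strength C is searched.
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

best_C = search.best_params_["clf__C"]
# With scoring="f1", score() reports the F1-score of the refitted best model.
test_f1 = search.score(X_test, y_test)
```

Putting the scaler and the classifier in one `Pipeline` keeps the held-out fold from influencing the scaling statistics during the search.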
Model Evaluation:
Measured accuracy, precision, recall, and F1-score for performance assessment.
Generated a confusion matrix and ROC curve to evaluate classification performance.
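The evaluation metrics listed above can be computed with scikit-learn as in the sketch below. The labels and predicted probabilities are a tiny hand-made example, and the 0.5 decision threshold is an assumption; neither reflects the project's real predictions.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score,
                             roc_curve)

# Toy ground truth (1 = admitted) and predicted admission probabilities.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_prob = np.array([0.9, 0.2, 0.7, 0.4, 0.6, 0.8, 0.1, 0.3])

# Assumed decision threshold of 0.5 to turn probabilities into class labels.
y_pred = (y_prob >= 0.5).astype(int)

# Rows are true classes (0, 1); columns are predicted classes (0, 1).
cm = confusion_matrix(y_true, y_pred)

accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)

# ROC metrics use the probabilities, not the thresholded labels.
auc = roc_auc_score(y_true, y_prob)
fpr, tpr, thresholds = roc_curve(y_true, y_prob)
```

On these toy values there is one false positive and one false negative, so accuracy, precision, recall, and F1 all come out to 0.75, while the ROC AUC is 0.9375 because 15 of the 16 positive/negative pairs are ranked correctly. The `fpr` and `tpr` arrays are what one would plot to draw the ROC curve.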
Deployment:
Developed a Streamlit web app for real-time predictions.
Serialized the trained model with pickle so the app can load it quickly for inference.
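A rough sketch of how the deployment pieces could fit together. Everything named here is hypothetical: the helper functions, the `FEATURES` order, the widget ranges and defaults, and the `model.pkl` file name are assumptions, and the model is trained on synthetic data only to demonstrate the pickle round trip.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumed feature order; a real app must match the order used in training.
FEATURES = ["GRE", "TOEFL", "University Rating", "SOP", "LOR", "CGPA", "Research"]


def save_model(model, path):
    """Serialize a fitted model to disk with pickle."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a pickled model. Only unpickle files from a trusted source."""
    with open(path, "rb") as f:
        return pickle.load(f)


def admission_probability(model, values):
    """Return the predicted probability of the positive (admitted) class."""
    row = np.asarray(values, dtype=float).reshape(1, -1)
    return float(model.predict_proba(row)[0, 1])


def render_app(model):
    """Hypothetical Streamlit front end; ranges and defaults are assumptions."""
    import streamlit as st  # imported lazily so the helpers work without it

    st.title("Admission Predictor")
    gre = st.number_input("GRE score", min_value=260, max_value=340, value=320)
    toefl = st.number_input("TOEFL score", min_value=0, max_value=120, value=105)
    rating = st.slider("University rating", 1, 5, 3)
    sop = st.slider("SOP strength", 1.0, 5.0, 3.0, 0.5)
    lor = st.slider("LOR strength", 1.0, 5.0, 3.0, 0.5)
    cgpa = st.number_input("CGPA", min_value=0.0, max_value=10.0, value=8.0, step=0.1)
    research = st.selectbox("Research experience (0 = no, 1 = yes)", [0, 1])
    if st.button("Predict"):
        p = admission_probability(
            model, [gre, toefl, rating, sop, lor, cgpa, research])
        st.write(f"Estimated admission probability: {p:.0%}")


# Demonstration of the save/load round trip on synthetic data (not project data).
X, y = make_classification(n_samples=200, n_features=len(FEATURES), random_state=0)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pkl"
    save_model(model, model_path)
    restored = load_model(model_path)

same_predictions = bool((restored.predict(X) == model.predict(X)).all())
probability = admission_probability(restored, X[0])
```

In an actual app script, one would call `render_app(load_model("model.pkl"))` at the top level and start it with `streamlit run` on that script. Pickling the scaler and classifier together as one pipeline means the app applies the same scaling at inference time that was used in training.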
Key Learnings
Feature engineering is crucial: Selecting and scaling the right features significantly improves model performance.
Logistic Regression works well for binary classification with interpretable results.
Model evaluation should go beyond accuracy: Precision, recall, and F1-score provide deeper insights into performance.
Automating preprocessing and modeling as a pipeline makes the workflow more efficient and reproducible.
Deploying ML models using Streamlit allows users to interact with predictions easily.
Real-world applications require continuous improvement, such as expanding the dataset or testing other models for further enhancement.